4 research outputs found

    Visual Guidance for Unmanned Aerial Vehicles with Deep Learning

    Unmanned Aerial Vehicles (UAVs) have been widely applied in military and civilian domains. In recent years, the operation mode of UAVs has been evolving from teleoperation to autonomous flight. To fulfill the goal of autonomous flight, a reliable guidance system is essential. Since the combination of the Global Positioning System (GPS) and an Inertial Navigation System (INS) cannot sustain autonomous flight in situations where GPS signals are degraded or unavailable, computer vision has been widely explored as a primary method for UAV guidance. Moreover, GPS provides the robot with no information about the presence of obstacles. Stereo cameras have a more complex architecture and require a minimum baseline to generate a disparity map. By contrast, monocular cameras are simple and require fewer hardware resources. Benefiting from state-of-the-art Deep Learning (DL) techniques, especially Convolutional Neural Networks (CNNs), a monocular camera is sufficient to infer mid-level visual representations such as depth maps and optical flow (OF) maps from the environment. Therefore, the objective of this thesis is to develop a real-time visual guidance method for UAVs in cluttered environments using a monocular camera and DL. The three major tasks performed in this thesis are investigating the development of DL techniques and monocular depth estimation (MDE), developing real-time CNNs for MDE, and developing visual guidance methods on the basis of the developed MDE system. A comprehensive survey is conducted, which covers Structure from Motion (SfM)-based methods, traditional handcrafted feature-based methods, and state-of-the-art DL-based methods. More importantly, it also investigates the application of MDE in robotics. Based on the survey, two CNNs for MDE are developed. In addition to promising accuracy, these two CNNs run at high frame rates (126 fps and 90 fps, respectively) on a single modest-power Graphics Processing Unit (GPU). As regards the third task, visual guidance for UAVs is first developed on top of the designed MDE networks. To improve the robustness of UAV guidance, OF maps are integrated into the developed visual guidance method. A cross-attention module is applied to fuse the features learned from the depth maps and OF maps. The fused features are then passed through a deep reinforcement learning (DRL) network to generate the policy for guiding the flight of the UAV. Additionally, a simulation framework is developed that integrates AirSim, Unreal Engine, and PyTorch. The effectiveness of the developed visual guidance method is validated through extensive experiments in the simulation framework.
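
    The following is a minimal PyTorch sketch of the fusion idea described in the abstract: two small CNN encoders produce feature tokens from a depth map and an optical-flow map, a cross-attention module fuses them, and a policy head maps the fused features to flight actions. The layer sizes, token layout, and discrete action set are illustrative assumptions, not the thesis architecture.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Small CNN that turns a 2-D map into a sequence of feature tokens."""
    def __init__(self, in_ch, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):                      # x: (B, C, H, W)
        f = self.net(x)                        # (B, dim, H/8, W/8)
        return f.flatten(2).transpose(1, 2)    # (B, N_tokens, dim)

class CrossAttentionPolicy(nn.Module):
    """Fuse depth and optical-flow features with cross-attention, then emit action logits."""
    def __init__(self, dim=128, n_actions=5):
        super().__init__()
        self.depth_enc = Encoder(in_ch=1, dim=dim)   # depth map: 1 channel
        self.flow_enc = Encoder(in_ch=2, dim=dim)    # optical flow: 2 channels (u, v)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.policy = nn.Sequential(
            nn.Linear(dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),                # hypothetical discrete flight actions
        )

    def forward(self, depth, flow):
        q = self.depth_enc(depth)                    # depth tokens as queries
        kv = self.flow_enc(flow)                     # flow tokens as keys/values
        fused, _ = self.cross_attn(q, kv, kv)        # cross-attention fusion
        return self.policy(fused.mean(dim=1))        # (B, n_actions)

# Example forward pass on dummy inputs
model = CrossAttentionPolicy()
logits = model(torch.randn(1, 1, 96, 128), torch.randn(1, 2, 96, 128))
print(logits.shape)  # torch.Size([1, 5])
```

    In a DRL setting such as the one described, these logits would parameterize the policy distribution that an algorithm like PPO or DQN optimizes inside the AirSim/Unreal Engine simulation loop.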

    Monocular Visual-IMU Odometry: A Comparative Evaluation of Detector-Descriptor-Based Methods

    Monocular visual-IMU (Inertial Measurement Unit) odometry has been widely used in various intelligent vehicles. As a popular technique, detector-descriptor based visual-IMU odometry is effective and efficient because local descriptors are robust against occlusions, background clutter, and abrupt content changes. However, to our knowledge, there has been no comprehensive comparative evaluation of the performance of different combinations of recently developed detectors and descriptors. In order to bridge this gap, we conduct such a comparative study in a unified framework. In particular, six typical routes with different lengths, shapes, and road scenes are selected from the well-known KITTI dataset. First, we evaluate the performance of different combinations of salient point detectors and local descriptors on the six routes. Then, we tune the parameters of the best detector or descriptor obtained for each route to achieve better results. This study provides not only comprehensive benchmarks for assessing various algorithms, but also instructive guidelines and insights for developing detectors and descriptors to handle different road scenes.
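
    As an illustration of what one detector-descriptor combination looks like in a monocular odometry front end, the sketch below uses OpenCV (not the paper's evaluation code): ORB keypoints are described with BRISK, matched between consecutive frames, and the relative pose is recovered from the essential matrix. The intrinsics matrix K is a placeholder; KITTI supplies the real calibration per sequence, and the IMU fusion and the evaluation harness from the paper are omitted.

```python
import cv2
import numpy as np

K = np.array([[718.856, 0.0, 607.1928],   # assumed pinhole intrinsics (placeholder values)
              [0.0, 718.856, 185.2157],
              [0.0, 0.0, 1.0]])

detector = cv2.ORB_create(nfeatures=2000)      # salient point detector (swappable)
descriptor = cv2.BRISK_create()                # local descriptor (swappable)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)  # Hamming norm for binary descriptors

def relative_pose(img_prev, img_curr):
    """Estimate the camera rotation and unit-scale translation between two grayscale frames."""
    kp1 = detector.detect(img_prev, None)
    kp2 = detector.detect(img_curr, None)
    kp1, des1 = descriptor.compute(img_prev, kp1)
    kp2, des2 = descriptor.compute(img_curr, kp2)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t
```

    Swapping `detector` and `descriptor` for other OpenCV implementations (e.g. SIFT, AKAZE) is the kind of combination the study compares; accumulating the per-frame poses along a route and comparing against KITTI ground truth yields the trajectory error used for evaluation.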